-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[flytekit][1][SimpleTransformer] Binary IDL With MessagePack #2756
Conversation
Signed-off-by: Future-Outlier <[email protected]>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2756 +/- ##
============================================
- Coverage 100.00% 75.84% -24.16%
============================================
Files 5 194 +189
Lines 122 19784 +19662
Branches 0 3899 +3899
============================================
+ Hits 122 15005 +14883
- Misses 0 4100 +4100
- Partials 0 679 +679 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you help me understand the usage of mashumaro please?
from flytekit.models.types import LiteralType, SimpleType, TypeStructure, UnionType | ||
|
||
T = typing.TypeVar("T") | ||
DEFINITIONS = "definitions" | ||
TITLE = "title" | ||
|
||
|
||
# In Mashumaro, the default encoder uses strict_map_key=False, while the default decoder uses strict_map_key=True. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function doesn't have anything to do with Mashumaro right? why mention mashumaro in the comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we want to use it in our mashumaro's
decoder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function will be put into _default_flytekit_decoder
.
For example, if we access dataclasss.dict_int_str
in a workflow, then we will use from_binary_idl
here to turn the Binary IDL object to Dict[int, str]
.
Note:
dict_int_str
isDict[int, str]
.Dict[int, str]
is a non-strict type
try: | ||
decoder = self._msgpack_decoder[expected_python_type] | ||
except KeyError: | ||
decoder = MessagePackDecoder(expected_python_type, pre_decoder_func=_default_flytekit_decoder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part isn't clear to me... why are we using the MessagePackDecoder from Mashumaro? isn't that just for dataclasses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part isn't clear to me... why are we using the MessagePackDecoder from Mashumaro? isn't that just for dataclasses?
We use MessagePackDecoder
because when accessing attributes from a dataclass, we receive a Binary IDL
from propeller, which can be any type (int, float, bool, str, list, dict, dataclass, Pydantic BaseModel, or Flyte types).
The expected flow is: Binary IDL
-> msgpack bytes
-> python val
.
MessagePackDecoder[expected_python_type].decode
is more reliable than msgpack.dumps
because it guarantees the type is always correct.
(It has expected_python_type
as a hint, and it can handle cases like
expected type: float
, actual type: int
, and convert it to float back.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a puristic pov what you're saying is true, @wild-endeavor, but given that mashumaro is already a dependency and also given the fact that msgpack.loads
is unable to unmarshal some values, I'm in favor of leaving this more complex implementation of the top-level from_binary_idl
.
Simple cases like this fail with using msgpack.loads
:
Python 3.12.5 (main, Aug 14 2024, 04:32:18) [Clang 18.1.8 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime
>>> from mashumaro.codecs.msgpack import MessagePackEncoder
>>> encoder = MessagePackEncoder(type(datetime.now()))
>>> encoder.encode(datetime.now())
b'\xba2024-09-24T21:52:43.704551'
>>> msgpack.dumps(datetime.now())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/eduardo/repos/flyte-examples/.venv/lib/python3.12/site-packages/msgpack/__init__.py", line 36, in packb
return Packer(**kwargs).pack(o)
^^^^^^^^^^^^^^^^^^^^^^^^
File "msgpack/_packer.pyx", line 279, in msgpack._cmsgpack.Packer.pack
File "msgpack/_packer.pyx", line 276, in msgpack._cmsgpack.Packer.pack
File "msgpack/_packer.pyx", line 270, in msgpack._cmsgpack.Packer._pack
File "msgpack/_packer.pyx", line 257, in msgpack._cmsgpack.Packer._pack_inner
TypeError: can not serialize 'datetime.datetime' object
Signed-off-by: Future-Outlier <[email protected]>
Tracking issue
flyteorg/flyte#5318
Why are the changes needed?
These changes enable the
SimpleTransformer
to usefrom_binary_idl
when receiving binary IDL inputs. This will be particularly useful when dealing with attribute access.What changes were proposed in this pull request?
from_binary_idl
method to theTypeTransformer
class.MessagePackDecoder
to decodemsgpack bytes
into Python values._default_flytekit_decoder
inMessagePackDecoder
to support future cases, such asDict[int, str]
(by settingstrict_map_key=False
).How was this patch tested?
Unit tests, local execution, and remote execution.
local execution is tested by this PR:
[WIP] Binary IDL With MessagePack Bytes #2751
remote exeuction is tested by these 2 PRs:
[WIP] Binary IDL With MessagePack Bytes #2751
[wip] Binary IDL With MessagePack Bytes flyte#5744
Setup process
Screenshots
Check all the applicable boxes